42 research outputs found

    Online backchannel synthesis evaluation with the switching Wizard of Oz

    In this paper, we evaluate a backchannel synthesis algorithm in an online conversation between a human speaker and a virtual listener. We adopt the Switching Wizard of Oz (SWOZ) approach to assess behavior synthesis algorithms online. A human speaker watches a virtual listener that is either controlled by a human listener or by an algorithm. The source switches at random intervals. Speakers indicate when they feel they are no longer talking to a human listener. Analysis of these responses reveals patterns of inappropriate behavior in terms of quantity and timing of backchannels

    Selecting appropriate agent responses based on non-content features

    This paper describes work-in-progress on a study to create models of responses of virtual agents that are selected only based on non-content features, such as prosody and facial expressions. From a corpus of human-human interactions, in which one person was playing the part of an agent and the second person a user, we extracted the turns of the user and gave these to annotators. The annotators had to select utterances from a list of phrases in the repertoire of our agent that would be a good response to the user utterance. The corpus is used to train response selection models based on automatically extracted features and on human annotations of the user-turns

    The effect of multiple modalities on the perception of a listening agent

    Listening agents are IVAs that display attentive listening behavior to a human speaker. The research into listening agents has mainly focused on (1) automatically timing listener responses; and (2) investigating the perceptual quality of listening behavior. Both issues have predominantly been addressed in an offline fashion, e.g. based on controlled animations that were rated by human observers. This allows for the systematic investigation of variables such as the quantity, type and timing of listening behaviors. However, there is a trade-off between the control and the realism of the stimuli. The display of head movement and facial expressions makes the animated listening behavior more realistic but hinders the investigation of specific behavior such as the timing of a backchannel. To mitigate these problems, the Switching Wizard of Oz (SWOZ) framework was introduced in [1]. In online speaker-listener dialogs, a human listener and a behavior synthesis algorithm simultaneously generate backchannel timings. The listening agent is animated based on one of the two sources, which is switched at random time intervals. Speakers are asked to press a button whenever they think the behavior is not human-like. As both human and algorithm have the same limited means of expression, these judgements can solely be based on aspects of the behavior such as the quantity and timing of backchannels. In [1], the listening agent only showed head nods. In the current experiment, we investigate the effect of adding facial expressions. Facial expressions such as smiles and frowns are known to function as backchannels as they can be regarded as a signal of understanding and attention

    Online behavior evaluation with the switching wizard of Oz

    Advances in animation and sensor technology allow us to engage in face-to-face conversations with virtual agents [1]. One major challenge is to generate appropriate, human-like behavior for the virtual agent that is contingent on that of the human conversational partner. Models of (nonverbal) behavior are predominantly learned from corpora of dialogs between human subjects [2], or based on simple observations from literature (e.g. [3,4,5,6])

    Analysis of the leading components of the marketing potential of the agrarian sector of the economy

    The leading components of the marketing potential of the agrarian sector of the economy are identified and analyzed: labor (personnel) and financial potential; organizational, managerial and informational support for marketing; and the potential of marketing departments. Attention is drawn to the absence of marketing departments in agricultural enterprises and to the low level of development of this potential overall. The need to develop measures that improve the management efficiency of enterprises in the agrarian sector through the use of marketing is emphasized

    Response Selection and Turn-taking for a Sensitive Artificial Listening Agent

    Communication with machines is inherently not ‘natural’, but we still prefer to interact with them without learning new skills, using types of communication we already know. Ideally, we want to communicate with a machine just as we communicate with people: we explain (using our voice and gestures) what we want the machine to do, and it understands this and performs the required task. In its simplest form, such a dialogue system receives the user’s input as written text, which it has to parse and analyze to extract the intentions of the user. But a more complex dialogue system can perceive the user via a microphone and a camera, and the user can use normal speech and gestures to explain his or her intentions. However, this means that the system has to take other aspects of human conversation into account besides interpreting the user’s intentions. For example, it has to manage correct turn-taking behaviour, it has to provide feedback, and it has to manage a correct level of politeness. This thesis focusses on two aspects of the interaction between a user and a virtual agent (a dialogue system with a visual embodiment), namely the perception of turn-taking strategies and the selection of appropriate responses. This research was carried out in the context of the SEMAINE project, in which a virtual listening agent was built: a virtual agent that tries to keep the user talking for as long as possible. Additionally, the system consists of four specific characters, each with a certain emotional state: a happy, a gloomy, an aggressive, and a pragmatic one. These characters also try to get the user in the same emotional state as they themselves are in. Turn-taking is a good example of something that is completely natural for most people, but very hard to teach a system. 
And while most dialogue systems focus on having the agent’s responses start as soon as possible after the user’s end of turn without overlapping it, evidence indicates that starting too early or too late is not always inappropriate per se. People might start speaking too early because of their enthusiasm, or they might start later than usual because they are thinking. This thesis describes the study of how different turn-taking strategies used by a dialogue system influence the perception that users have of that system. These turn-taking strategies are different start times of the next turn (starting before the user’s turn has finished, directly when it finishes or after a small pause) and different responses when overlapping speech is detected (stop speaking, continue normally or continue with a raised voice). These strategies were evaluated in two studies. In the first study, a simulator was created that generated conversations by having two agents ‘talk’ to each other. The turn-taking behaviour of each agent was scripted beforehand, and the resulting conversation was played by using non-intelligible speech. After listening to a simulated conversation, the users had to complete a questionnaire containing semantic differential scales about how they perceived a participant in the conversation. In the second study, the users actively participated in the conversation themselves. They were interviewed by a dialogue system, but the exact timing of each question was controlled by a human wizard. This wizard varied the start time of the questions depending on the selected strategy of that particular interview, and after each interview the users had to complete a questionnaire about how they perceived the dialogue system. These studies showed that starting too early (that is, interrupting the user) was mostly associated with negative and strong personality attributes: agents were perceived as less agreeable and more assertive. 
Leaving pauses between turns had contrary associations: it was perceived as more agreeable, less assertive, and created the feeling of having more rapport. It also showed that different strategies influence the response behaviour of the users as well. The users seemed to ‘adapt’ to the interviewing agent’s turn-taking strategy, for example by talking faster and with shorter turns when the interviewer started early during the interview. The final part of the thesis describes the response selection of the listening agent. We decided to select an appropriate response based on the non-verbal input, rather than on the content of the user’s speech, to make the listening agent capable of responding appropriately regardless of the topic. This thesis first describes the handcrafted models and then the more data-driven approach. In this approach, humans annotated videos containing user turns with appropriate possible responses. Classifiers were then used to learn how to respond after a user’s turn. Different methods were used to create the training data and evaluate the results. The classifiers were tested by letting them predict appropriate responses for new fragments and having humans rate these responses. We found that some classifiers produced significantly more appropriate responses than a random model

    Using Context to Disambiguate Communicative Signals

    After perceiving multi-modal behaviour from a user or agent, a conversational agent needs to be able to determine what was intended with that behaviour. Contextual variables play an important role in this process. We discuss the concept of context and its role in interpretation, analysing a number of examples. We show how in these cases contextual variables are needed to disambiguate multi-modal behaviours. Finally, we present some basic categories into which these contextual variables can be divided

    Turn Management or Impression Management?

    We look at how some basic choices in the management of turns influence the impression that people get from an agent. We look at scales concerning personality, emotion and interpersonal stance. We do this by a person perception study, or rather an agent perception study, using simulated conversations that systematically vary basic turn-taking strategies. We show how we can create different impressions of friendliness, rudeness, arousal and several other dimensions by varying the timing of the start of a turn with respect to the ending of the interlocutor’s turn and by varying the strategy of ending or not ending a turn when overlap is detected